Acquiring and updating hierarchical knowledge for machine translation based on a clustering technique

Authors

  • Takefumi Yamazaki
  • Michael J. Pazzani
  • Christopher J. Merz
Abstract

This paper addresses the problem of constructing a semantic hierarchy for a machine translation system. We propose two methods of constructing a hierarchy: acquiring a hierarchy from scratch and updating an existing hierarchy. When acquiring a hierarchy from scratch, translation rules are first learned by an inductive learning algorithm. A new hierarchy is then generated by applying a clustering method to the internal disjunctions of the learned rules, and new rules are learned under the bias of this hierarchy. When updating an existing manually constructed hierarchy, we take advantage of its node structure. We report experimental results showing that the semantic hierarchies generated by our method yield learned translation rules with higher average accuracy.
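The clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering method or similarity measure the authors use, so everything here is an assumption made for illustration: each internal disjunction is modelled as a set of words, words are compared by the Jaccard similarity of the sets of disjunctions they appear in, and the hierarchy is built by simple agglomerative merging. The function names `jaccard` and `build_hierarchy` and the example words are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_hierarchy(disjunctions):
    """Agglomeratively cluster words that co-occur in internal disjunctions.

    `disjunctions` is a list of word sets, one per internal disjunction of a
    learned rule (an assumed representation, not the paper's). Returns a
    binary tree as nested tuples whose leaves are words.
    """
    # Map each word to the indices of the disjunctions that contain it.
    membership = {}
    for index, group in enumerate(disjunctions):
        for word in group:
            membership.setdefault(word, set()).add(index)

    # Each cluster is a (tree, disjunction-index set) pair; start from leaves.
    clusters = sorted(membership.items())
    while len(clusters) > 1:
        # Pick the most similar pair; ties go to the first pair in order.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(clusters[p[0]][1], clusters[p[1]][1]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] | clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical disjunctions taken from four learned rules.
example = [
    {"dog", "cat"},
    {"dog", "cat", "horse"},
    {"car", "bus"},
    {"bus", "car", "truck"},
]
hierarchy = build_hierarchy(example)
print(hierarchy)
```

In this toy input, words that share disjunctions end up under a common node, which is the kind of grouping a learner could then use as a bias when relearning rules. A real system would likely need a stopping criterion and non-binary nodes, neither of which is specified in the abstract.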

Publication date: 1995